WHAT IS CLAIMED IS:

1. An isolated nucleic acid comprising a sequence as set forth in SEQ ID NO: 3 and
variants thereof having at least about 50% identity to SEQ ID NO: 3 and encoding a
polypeptide having α-galactosidase activity.

2. The isolated nucleic acid of claim 1, comprising a sequence as set forth in SEQ ID
NO: 3, sequences substantially identical thereto, and sequences complementary 
thereto. 

3. An isolated nucleic acid that hybridizes to a nucleic acid of claim 1 under conditions 
of high stringency. 

4. An isolated nucleic acid that hybridizes to a nucleic acid of claim 1 under conditions 
of moderate stringency. 

5. An isolated nucleic acid that hybridizes to a nucleic acid of claim 1 under conditions 
of low stringency. 

6. An isolated nucleic acid having at least about 55% homology to the nucleic acid of 
claim 1 as determined by analysis with a sequence comparison algorithm. 

7. An isolated nucleic acid having at least about 60% homology to the nucleic acid of 
claim 1 as determined by analysis with a sequence comparison algorithm. 

8. An isolated nucleic acid having at least about 65% homology to the nucleic acid of 
claim 1 as determined by analysis with a sequence comparison algorithm. 

9. An isolated nucleic acid having at least 70% homology to the nucleic acid of claim 1 
as determined by analysis with a sequence comparison algorithm. 

10. An isolated nucleic acid having at least about 75% homology to the nucleic acid of 
claim 1 as determined by analysis with a sequence comparison algorithm. 

11. An isolated nucleic acid having at least 80% homology to the nucleic acid of claim 1
as determined by analysis with a sequence comparison algorithm. 

12. An isolated nucleic acid having at least about 85% homology to the nucleic acid of 
claim 1 as determined by analysis with a sequence comparison algorithm. 

13. An isolated nucleic acid having at least 90% homology to the nucleic acid of claim 1 
as determined by analysis with a sequence comparison algorithm. 

14. An isolated nucleic acid having at least about 95% homology to the nucleic acid of 
claim 1 as determined by analysis with a sequence comparison algorithm. 

15. The isolated nucleic acid of claim 1, 2, 6, 7, 8, 9, 10, 11, or 12, wherein the sequence
comparison algorithm is FASTA version 3.0t78 with the default parameters. 

16. An isolated nucleic acid comprising at least 10 consecutive bases of SEQ ID NO: 3, 
sequences substantially identical thereto, and sequences complementary thereto. 
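
As an illustration only, the "at least 10 consecutive bases" criterion of claim 16 can be checked with a simple k-mer lookup. The sketch below is a minimal Python example under stated assumptions: the function names are hypothetical, the reference string passed in stands in for SEQ ID NO: 3 (which is not reproduced here), and only exact matches against the reference and its reverse complement are considered, not "substantially identical" sequences.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def shares_consecutive_bases(candidate: str, reference: str, k: int = 10) -> bool:
    """True if candidate contains at least k consecutive bases that occur
    in the reference or in the reference's reverse complement."""
    candidate = candidate.upper()
    targets = (reference.upper(), reverse_complement(reference))
    # Collect every k-mer of the reference and of its complement strand.
    kmers = {t[i:i + k] for t in targets for i in range(len(t) - k + 1)}
    return any(candidate[i:i + k] in kmers
               for i in range(len(candidate) - k + 1))


# Placeholder sequence for illustration; NOT the actual SEQ ID NO: 3.
reference = "ATGGCGTACGTTAGCCTAGGATCC"
print(shares_consecutive_bases("TTTT" + reference[5:15] + "AAAA", reference))
```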

17. An isolated nucleic acid having at least about 50% homology to the nucleic acid of 
claim 10 as determined by analysis with a sequence comparison algorithm or FASTA 
version 3.0t78 with the default parameters. 

18. An isolated nucleic acid having at least about 55% homology to the nucleic acid of 
claim 10 as determined by analysis with a sequence comparison algorithm or FASTA 
version 3.0t78 with the default parameters. 

19. An isolated nucleic acid having at least about 60% homology to the nucleic acid of 
claim 10 as determined by analysis with a sequence comparison algorithm or FASTA 
version 3.0t78 with the default parameters. 

20. An isolated nucleic acid having at least about 65% homology to the nucleic acid of 
claim 10 as determined by analysis with a sequence comparison algorithm or FASTA 
version 3.0t78 with the default parameters. 

21. An isolated nucleic acid having at least 70% homology to the nucleic acid of claim 10 
as determined by analysis with a sequence comparison algorithm or FASTA version 
3.0t78 with the default parameters. 

22. An isolated nucleic acid encoding a polypeptide having a sequence as set forth in 
SEQ ID NO: 4, and sequences substantially identical thereto. 

23. An isolated nucleic acid encoding a polypeptide comprising at least 10 consecutive 
amino acids of a polypeptide having a sequence as set forth in SEQ ID NO: 4, and
sequences substantially identical thereto. 

24. A purified polypeptide substantially identical to the polypeptide of claim 22 or 23 as 
determined by analysis with a sequence comparison algorithm or FASTA version 
3.0t78 with the default parameters. 

25. A purified polypeptide having at least about 50% homology to the polypeptide of 
claim 22 or 23 as determined by analysis with a sequence comparison algorithm or 
FASTA version 3.0t78 with the default parameters. 

26. A purified polypeptide having at least about 55% homology to the polypeptide of
claim 22 or 23 as determined by analysis with a sequence comparison algorithm or 
FASTA version 3.0t78 with the default parameters. 

27. A purified polypeptide having at least about 60% homology to the polypeptide of 
claim 22 or 23 as determined by analysis with a sequence comparison algorithm or 
FASTA version 3.0t78 with the default parameters. 

28. A purified polypeptide having at least about 65% homology to the polypeptide of 
claim 22 or 23 as determined by analysis with a sequence comparison algorithm or 
FASTA version 3.0t78 with the default parameters. 

29. A purified polypeptide having at least 70% homology to the polypeptide of claim 22 
or 23 as determined by analysis with a sequence comparison algorithm or FASTA 
version 3.0t78 with the default parameters. 

30. A purified polypeptide having at least about 75% homology to the polypeptide of 
claim 22 or 23 as determined by analysis with a sequence comparison algorithm or 
FASTA version 3.0t78 with the default parameters. 

31. A purified polypeptide having at least 80% homology to the polypeptide of claim 22
or 23 as determined by analysis with a sequence comparison algorithm or FASTA 
version 3.0t78 with the default parameters. 

32. A purified polypeptide having at least about 85% homology to the polypeptide of 
claim 22 or 23 as determined by analysis with a sequence comparison algorithm or 
FASTA version 3.0t78 with the default parameters. 

33. A purified polypeptide having at least about 90% homology to the polypeptide of 
claim 22 or 23 as determined by analysis with a sequence comparison algorithm or 
FASTA version 3.0t78 with the default parameters. 

34. A purified polypeptide having at least about 95% homology to the polypeptide of 
claim 22 or 23 as determined by analysis with a sequence comparison algorithm or 
FASTA version 3.0t78 with the default parameters. 

35. A purified polypeptide having a sequence as set forth in SEQ ID NO: 4 and
sequences substantially identical thereto. 

36. A purified antibody that specifically binds to a polypeptide comprising a sequence as 
set forth in SEQ ID NO: 4, and sequences substantially identical thereto. 

37. A purified antibody that specifically binds to a polypeptide having at least 10
consecutive amino acids of the polypeptides as set forth in SEQ ID NO: 4, and 
sequences substantially identical thereto. 

38. The antibody of claim 36 or 37, wherein the antibodies are polyclonal. 

39. The antibody of claim 36 or 37, wherein the antibodies are monoclonal. 

40. A method of producing a polypeptide having a sequence as set forth in SEQ ID NO: 
4, and sequences substantially identical thereto comprising introducing a nucleic acid 
encoding the polypeptide into a host cell under conditions that allow expression of the 
polypeptide and recovering the polypeptide. 

41. A method of producing a polypeptide comprising at least 10 amino acids of a 
sequence as set forth in SEQ ID NO: 4, and sequences substantially identical thereto 
comprising introducing a nucleic acid encoding the polypeptide, operably linked to a 
promoter, into a host cell under conditions that allow expression of the polypeptide 
and recovering the polypeptide. 

42. A method of generating a variant comprising: 

obtaining a nucleic acid comprising a sequence as set forth in SEQ ID NO: 3, 
sequences substantially identical thereto, sequences complementary thereto, 
fragments comprising at least 30 consecutive nucleotides thereof, and fragments 
comprising at least 30 consecutive nucleotides of the sequences complementary to 
SEQ ID NO: 3; and 

modifying one or more nucleotides in said sequence to another nucleotide, 
deleting one or more nucleotides in said sequence, or adding one or more nucleotides 
to said sequence. 
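
The three modifications recited in claim 42 (changing, deleting, or adding nucleotides) correspond to simple string operations when a sequence is handled in silico. The following Python sketch is illustrative only; the helper names and the short parent sequence are hypothetical and are not taken from the specification.

```python
VALID_BASES = set("ACGT")


def substitute(seq: str, pos: int, base: str) -> str:
    """Replace the nucleotide at 0-based position pos with another base."""
    if base not in VALID_BASES:
        raise ValueError(f"not a nucleotide: {base!r}")
    return seq[:pos] + base + seq[pos + 1:]


def delete(seq: str, pos: int, length: int = 1) -> str:
    """Delete `length` nucleotides starting at 0-based position pos."""
    return seq[:pos] + seq[pos + length:]


def insert(seq: str, pos: int, bases: str) -> str:
    """Add one or more nucleotides before 0-based position pos."""
    if not set(bases) <= VALID_BASES:
        raise ValueError(f"not a nucleotide string: {bases!r}")
    return seq[:pos] + bases + seq[pos:]


# Placeholder parent sequence for illustration; NOT SEQ ID NO: 3.
parent = "ATGGCGTAC"
variants = [substitute(parent, 3, "A"), delete(parent, 3, 2), insert(parent, 3, "TT")]
print(variants)
```

These helpers only model the resulting sequence; they say nothing about which laboratory technique (for example, those listed in claim 43) would be used to introduce the change.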

43. The method of claim 42, wherein the modifications are introduced by a method 
selected from the group consisting of error-prone PCR, shuffling, oligonucleotide- 
directed mutagenesis, assembly PCR, sexual PCR mutagenesis, in vivo mutagenesis, 
cassette mutagenesis, recursive ensemble mutagenesis, exponential ensemble 
mutagenesis, site-specific mutagenesis, gene reassembly, gene site saturated 
mutagenesis and any combination thereof. 

44. The method of claim 42, wherein the modifications are introduced by error-prone 
PCR. 

45. The method of claim 42, wherein the modifications are introduced by shuffling.

46. The method of claim 42, wherein the modifications are introduced by 
oligonucleotide-directed mutagenesis. 

47. The method of claim 42, wherein the modifications are introduced by assembly PCR. 

48. The method of claim 42, wherein the modifications are introduced by sexual PCR 
mutagenesis. 

49. The method of claim 42, wherein the modifications are introduced by in vivo 
mutagenesis. 

50. The method of claim 42, wherein the modifications are introduced by cassette 
mutagenesis. 

51. The method of claim 42, wherein the modifications are introduced by recursive
ensemble mutagenesis. 

52. The method of claim 42, wherein the modifications are introduced by exponential 
ensemble mutagenesis. 

53. The method of claim 42, wherein the modifications are introduced by site-specific 
mutagenesis. 

54. The method of claim 42, wherein the modifications are introduced by gene 
reassembly. 

55. The method of claim 42, wherein the modifications are introduced by gene site 
saturated mutagenesis. 

56. A computer readable medium having stored thereon a nucleic acid sequence as set 
forth in SEQ ID NO: 3, and sequences substantially identical thereto, or a polypeptide 
sequence as set forth in SEQ ID NO: 4, and sequences substantially identical thereto. 

57. A computer system comprising a processor and a data storage device wherein said data 
storage device has stored thereon a nucleic acid sequence as set forth in SEQ ID NO: 3, 
and sequences substantially identical thereto, or a polypeptide sequence as set forth in 
SEQ ID NO: 4, and sequences substantially identical thereto. 

58. The computer system of claim 45, further comprising a sequence comparison algorithm 
and a data storage device having at least one reference sequence stored thereon. 

59. The computer system of claim 58, wherein the sequence comparison algorithm 
comprises a computer program which indicates polymorphisms. 

60. The computer system of claim 57, further comprising an identifier which identifies
features in said sequence. 

61. A method for comparing a first sequence to a reference sequence wherein said first
sequence is a nucleic acid sequence as set forth in SEQ ID NO: 3, and sequences
substantially identical thereto, or a polypeptide sequence as set forth in SEQ ID NO: 4, 
and sequences substantially identical thereto comprising: 

reading the first sequence and the reference sequence through use of a computer 
program which compares sequences; and 

determining differences between the first sequence and the reference sequence 
with the computer program. 

62. The method of claim 61, wherein determining differences between the first sequence 
and the reference sequence comprises identifying polymorphisms. 
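
A minimal sketch of the comparison step in claims 61 and 62 is shown below in Python. It assumes, purely for illustration, that the two sequences have already been aligned to equal length; it is not the FASTA program named elsewhere in the claims, and the function name is hypothetical.

```python
def find_differences(first: str, reference: str) -> list[tuple[int, str, str]]:
    """Compare two pre-aligned, equal-length sequences position by position.

    Returns (0-based position, reference residue, first-sequence residue)
    for every position at which the two sequences differ.
    """
    if len(first) != len(reference):
        raise ValueError("sequences must be pre-aligned to the same length")
    return [
        (i, ref_res, first_res)
        for i, (first_res, ref_res) in enumerate(zip(first.upper(), reference.upper()))
        if first_res != ref_res
    ]


# Placeholder sequences for illustration; NOT the sequences of the listing.
print(find_differences("ATGACGTAC", "ATGGCGTAC"))
```

Each reported tuple marks a candidate single-residue difference; whether it represents a polymorphism in the sense of claim 62 depends on context that this sketch does not model.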

63. A method for identifying a feature in a sequence wherein the sequence is as set forth in 
SEQ ID NO: 3, sequences substantially identical thereto, or a polypeptide sequence as 
set forth in SEQ ID NO: 4, and sequences substantially identical thereto comprising: 

reading the sequence through the use of a computer program which identifies 
features in sequences; and 

identifying features in the sequences with the computer program. 
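
Claim 63 does not say which features the computer program identifies. As one hypothetical example, the Python sketch below scans the forward strand of a nucleic acid sequence for open reading frames (an ATG start codon followed in frame by a stop codon). The choice of feature, the function name, and the test sequence are all assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 1) -> list[tuple[int, int]]:
    """Return (start, end) 0-based half-open coordinates of forward-strand
    open reading frames: ATG ... in-frame stop codon (stop included)."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a reading frame
            elif start is not None and codon in STOP_CODONS:
                # Count coding codons (start codon included, stop excluded).
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


# Placeholder sequence for illustration; NOT SEQ ID NO: 3.
print(find_orfs("CCATGGCGTACTAAGG"))
```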

64. A purified polypeptide of claim 1, wherein the polypeptide is a thermostable enzyme 
which is stable to heat, is heat resistant and catalyzes the enzymatic hydrolysis of 
saccharides, and wherein the enzyme is able to renature and regain activity after 
exposure to temperatures of from about 60 degrees C to 105 degrees C. 

65. A method of catalyzing the hydrolysis of saccharides comprising contacting a sample 
containing saccharides with a polypeptide selected from the group consisting of SEQ 
ID NO: 4 and sequences having at least 50% homology and having α-galactosidase
enzyme activity under conditions which facilitate the hydrolysis of the saccharides.

66. An assay for identifying functional polypeptide fragments or variants encoded by 
fragments of SEQ ID NO: 3, and sequences substantially identical thereto, which 
retain the enzymatic function of the polypeptides of SEQ ID NO: 4, and sequences 
substantially identical thereto, said assay comprising: 

contacting the polypeptide of SEQ ID NO: 4, and sequences substantially identical
thereto, or polypeptide fragment or variant encoded by SEQ ID NO: 3, with a 
substrate molecule under conditions which allow said polypeptide or fragment or 
variant to function, and 

detecting either a decrease in the level of substrate or an increase in the level of 
the specific reaction product of the reaction between said polypeptide and substrate, 
wherein a decrease in the level of substrate or an increase in the level of the reaction 
product is indicative of a functional polypeptide or fragment or variant. 

67. A nucleic acid probe comprising an oligonucleotide from about 10 to 50 nucleotides
in length and having an area of at least 10 contiguous nucleotides that is at least 50%
complementary to a nucleic acid target region of the nucleic acid sequence set forth in
SEQ ID NO: 3 and which hybridizes to the nucleic acid target region under moderate
to highly stringent conditions to form a detectable target:probe duplex.
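
The percent-complementarity thresholds recited for the probes can be illustrated with a small calculation. The Python sketch below assumes a probe and a target region of equal length, counts Watson-Crick pairs in antiparallel orientation, and reports the fraction as a percentage. The function names and sequences are hypothetical, and hybridization stringency is not modeled.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def percent_complementarity(probe: str, target_region: str) -> float:
    """Percentage of positions at which the probe base-pairs with the target
    region when the two strands are aligned antiparallel."""
    if len(probe) != len(target_region):
        raise ValueError("probe and target region must have equal length")
    # A perfectly complementary probe equals the target's reverse complement,
    # so its own reverse complement reproduces the target strand.
    probe_as_target_strand = reverse_complement(probe)
    matches = sum(a == b for a, b in
                  zip(probe_as_target_strand, target_region.upper()))
    return 100.0 * matches / len(target_region)


# Placeholder target region for illustration; NOT taken from SEQ ID NO: 3.
target = "ATGGCGTACG"
print(percent_complementarity(reverse_complement(target), target))
```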

68. The probe of claim 67, wherein the oligonucleotide is DNA. 

69. The probe of claim 67, which is at least 55% complementary to the nucleic acid target 
region. 

70. The probe of claim 67, which is at least 60% complementary to the nucleic acid target 
region. 

71. The probe of claim 67, which is at least 65% complementary to the nucleic acid target 
region. 

72. The probe of claim 67, which is at least 70% complementary to the nucleic acid target 
region. 

73. The probe of claim 67, which is at least 75% complementary to the nucleic acid target 
region. 

74. The probe of claim 67, wherein the oligonucleotide comprises a sequence which is 
80% complementary to the nucleic acid target region. 

75. The probe of claim 67, which is at least 85% complementary to the nucleic acid target 
region. 

76. The probe of claim 67, wherein the oligonucleotide comprises a sequence which is 
90% complementary to the nucleic acid target region. 

77. The probe of claim 67, which is at least 95% complementary to the nucleic acid target
region. 

78. The probe of claim 67, which is fully complementary to the nucleic acid target region. 

79. The probe of claim 67, wherein the oligonucleotide is 15-50 bases in length. 

80. The probe of claim 67, wherein the probe further comprises a detectable isotopic 
label. 

81. The probe of claim 67, wherein the probe further comprises a detectable non-isotopic 
label selected from the group consisting of a fluorescent molecule, a 
chemiluminescent molecule, an enzyme, a cofactor, an enzyme substrate, and a 
hapten. 

82. A nucleic acid probe comprising an oligonucleotide from about 15 to 50 nucleotides 
in length and having an area of at least 15 contiguous nucleotides that is at least 90% 
complementary to a nucleic acid target region of the nucleic acid sequence set forth in 
SEQ ID NO: 3 and which hybridizes to the nucleic acid target region under moderate 
to highly stringent conditions to form a detectable target:probe duplex.

83. A nucleic acid probe comprising an oligonucleotide from about 15 to 50 nucleotides 
in length and having an area of at least 15 contiguous nucleotides that is at least 95% 
complementary to a nucleic acid target region of the nucleic acid sequence set forth in 
SEQ ID NO:3 and which hybridizes to the nucleic acid target region under moderate 
to highly stringent conditions to form a detectable target:probe duplex.

84. A nucleic acid probe comprising an oligonucleotide from about 15 to 50 nucleotides 
in length and having an area of at least 15 contiguous nucleotides that is at least 97% 
complementary to a nucleic acid target region of the nucleic acid sequence set forth in 
SEQ ID NO:3 and which hybridizes to the nucleic acid target region under moderate 
to highly stringent conditions to form a detectable target:probe duplex.

85. A polynucleotide probe for isolation or identification of α-galactosidase genes having
a sequence which is the same as or fully complementary to at least a portion of SEQ
ID NO: 3.

86. An enzyme preparation comprising a polypeptide of any one of claims 17 or 25 which 
is liquid. 

87. An enzyme preparation comprising the polypeptide of any one of claims 17 or 25
which is dry. 

88. A method for modifying small molecules, comprising mixing a polypeptide encoded
by a polynucleotide of claim 1 or fragments thereof with a small molecule to produce
a modified small molecule.

89. The method of claim 88 wherein a library of modified small molecules is tested to 
determine if a modified small molecule is present within the library which exhibits a 
desired activity. 

90. The method of claim 89 wherein a specific biocatalytic reaction which produces the
modified small molecule of desired activity is identified by systematically eliminating
each of the biocatalytic reactions used to produce a portion of the library, and then
testing the small molecules produced in the portion of the library for the presence or
absence of the modified small molecule with the desired activity.

91. The method of claim 90 wherein the specific biocatalytic reactions which produce the
modified small molecule of desired activity are optionally repeated.

92. The method of claim 90 or 91 wherein

(a) the biocatalytic reactions are conducted with a group of biocatalysts that react
with distinct structural moieties found within the structure of a small molecule,

(b) each biocatalyst is specific for one structural moiety or a group of related
structural moieties; and

(c) each biocatalyst reacts with many different small molecules which contain the 
distinct structural moiety. 